Jinjie Gu

Bridging the Detection-to-Abstention Gap in Reasoning Models under Insufficient Information

May 27, 2026

LATTE: Forecasting Peer Anchored Preference Trajectories for Personalized LLM Generation

May 26, 2026

MedMemoryBench: Benchmarking Agent Memory in Personalized Healthcare

May 12, 2026

LiveAgentBench: Comprehensive Benchmarking of Agentic Systems Across 104 Real-World Challenges

Mar 03, 2026

LiveClin: A Live Clinical Benchmark without Leakage

Feb 18, 2026

WebClipper: Efficient Evolution of Web Agents with Graph-based Trajectory Pruning

Feb 13, 2026

ClinAlign: Scaling Healthcare Alignment from Clinician Preference

Feb 11, 2026

V2P: Visual Attention Calibration for GUI Grounding via Background Suppression and Center Peaking

Jan 11, 2026

MedDialogRubrics: A Comprehensive Benchmark and Evaluation Framework for Multi-turn Medical Consultations in Large Language Models

Jan 07, 2026

Perplexity-Aware Data Scaling Law: Perplexity Landscapes Predict Performance for Continual Pre-training

Dec 25, 2025